- ID:13 QEMM-386: Exception #13 explained
- Quarterdeck Technical Note #142
- By: Michael Bolton
- Last revision: 2 March 1993
-
- Q. What is an Exception #13? What is an Exception #12?
- Q. What does the QEMM Exception message mean? How can it help me?
-
- Users of QEMM-386 may sometimes encounter a report that an attempt has been
- made to execute an invalid instruction. It is almost certain that QEMM-386,
- in and of itself, is not the cause of Exception #13 problems, though
- QEMM-386's memory management may come into conflict with other hardware and
- software on your system.
-
- Quarterdeck Technical Note #232, Exception #13 Advanced Troubleshooting
- (EX13FLOW.TEC) is designed to resolve conflicts in which QEMM-386 may be
- involved. This technical note is its companion; here we explain in detail
- what a processor exception is, how you can interpret the information provided
- by the exception report, and what you can do to remedy the situation in the
- unhappy event that the techniques in EX13FLOW.TEC don't provide relief from
- the problem.
-
- To answer the questions above, it's worthwhile to examine the Exception #13
- report bit by bit.
-
- "The processor has notified QEMM that an attempt has been made to execute an
- invalid instruction..."
-
- Exceptions are the processor's response to unusual, invalid, or special
- conditions in the normal operation of the 80386 processor and others in its
- family. (The 80386 family includes the 80386SX, the 80386DX, the 80486SX, and
- the 80486DX processors; their memory management architecture is essentially
- the same. In this document, the term "386" refers to any and all of these
- processors.) Exceptions cause the 386 processor to stop what it's doing and
- to try to react to the condition that caused the exception. QEMM-386 is
- designed to capture some of these exceptions -- particularly those caused by
- protection faults or invalid instructions, which could cause a program or the
- entire system to crash -- and display a report to the user. When the
- processor encounters an instruction that it does not want to execute, it
- passes control to the protected mode interrupt 13 (decimal) handler.
- QEMM-386's protected mode INT 13 handler posts the Exception #13 message.
- Neither DOS nor Microsoft's EMM386.EXE has a protected mode interrupt 13
- handler, so if an exception occurs using only DOS or EMM386.EXE, your system
- simply crashes and you have no report.
-
-
- Q. What causes an Exception #12 or Exception #13?
-
- "...This may be due to an error in one of your programs, a conflict between
- two pieces of software, or a conflict between a piece of hardware and a piece
- of software...."
-
- The exception reported is most commonly #13, the General Protection Fault
- exception. This indicates that a program has tried to execute an invalid or
- privileged instruction. On the 386 processor, programs can run at varying
- privilege levels, so that the processor can better protect application
- programs (which generally run at lower privilege levels) from crashing the
- operating system or control program (which typically runs at the highest
- privilege level). DOS and QEMM-386 do not enforce this protection, but
- QEMM-386 can report when a program running at the lowest privilege level tries
- to execute a privileged instruction. The result may be a system crash, but
- QEMM-386 does provide a report before the crash happens.
-
- Invalid instructions are harder to classify, for indeed Exception #13 is
- something of a catch-all. Some examples of invalid instructions include:
-
- - 386-specific instructions that are disallowed when the processor is in
- virtual 8086 mode. The processor is in this mode whenever QEMM-386 is in an
- ON state -- essentially when it is providing expanded memory or High RAM.
-
- - A program trying to write data to a segment that has been marked as
- executable or read-only (the data could overwrite program code).
-
- - Trying to run program code from a data segment (if data is read as code, it
- will be a series of meaningless or nonsensical instructions -- which, if
- executed, could jump to invalid addresses or overwrite the operating system)
-
- - Exceeding the limit of a segment. Segments in virtual 8086 mode are not
- permitted to exceed FFFFh (65535 decimal) bytes or to fall below 0 bytes.
- Neither a program instruction nor a memory reference may span the boundary
- of a segment.
-
- It is this last which is the most common; this is a problem also known as
- "segment wrap", which we will discuss later. Again, QEMM-386 is designed to
- trap and report these errors, but it cannot defend against the system crashes
- that they may cause.
-
- Occasionally Exception #12, indicating a stack exception, will be reported.
- This is a protection violation very similar to Exception #13, but is one in
- which the stack segment is involved in some way. Although no easier to solve,
- it is a somewhat less general report than Exception #13.
-
- Very infrequently, an Exception #0 is reported. This is not intentional; it
- is usually the result of QEMM's stack being corrupted while QEMM was trying to
- report another exception, or is the result of some other system error.
-
- It is important to remember that in the vast majority of cases, QEMM-386 is
- not involved with the problem, but is merely reporting it.
-
-
- Q. What do I do now?
-
- "...It is likely that the system is unstable now and should be rebooted...."
-
- QEMM-386 is designed to offer the user the opportunity to terminate the
- offending program, or to reboot the computer, but often the damage has already
- been done by the time that the exception is trapped and reported. In this
- instance, you may find the computer locked regardless of what you choose. If
- the computer is indeed hung, you should write down the information on the
- screen and then reboot the machine.
-
- While QEMM-386's exception reports can be cryptic to non-programmers -- or to
- programmers who have little experience with assembly language -- the
- information that they provide can sometimes be quite helpful. Exception
- reports can help you to identify which program has triggered the exception
- message, what the invalid instruction was, and the state of the processor's
- registers when the error occurred. Armed with this information, you may be
- able to help the developer of the offending application to determine the
- problem that led to the exception, and thus the developer may be able to
- provide a temporary workaround or a permanent fix.
-
- The exception report is divided into three parts --
-
- 1) The vector or class of exception, and its location and error code. The
- location of the exception indicates the address in memory at which the invalid
- instruction was attempted. The program loaded at this address (if indeed a
- program is loaded there) should be noted by running Manifest.
-
- Exception #13 at 1B12:0103, error code: 0000
-
- In this example, the program loaded at address 1B12:xxxx is automatically your
- suspect. Reboot your system in the same configuration as you had when the
- Exception #13 occurred. If the problem happened during an application
- program, don't load the application just yet. Load Manifest instead, and have
- a look at First Meg / Programs.
-
- Memory Area    Size   Description
- 03D1 - 0465    2.3K   COMMAND
- 0466 - 046A    0.1K   (04C0)
- 046B - 0483    0.4K   COMMAND Environment
- 0484 - 0487    0.1K   COMMAND Data
- 0488 - 0498    0.3K   DV Environment
- 0499 - 04BE    0.6K   DV
- 04BF - 1A38     85K   DV Data
- 1A39 - 1A52    0.4K   COMMAND Data
- 1A53 - 1AE7    2.3K   COMMAND
- 1AE8 - 1B00    0.4K   COMMAND Environment
- 1B01 - 7E4F    397K   [Available]
-
- The sample Exception #13 above happened in that Available range, so it was the
- program that would have been loaded had we not loaded Manifest -- that is, the
- application program. If you have a TSR loaded low, and the Exception #13 is
- occurring within that TSR's address space, then it is your suspect, rather than
- the application. In any case, the program whose code falls into the range in
- which the Exception #13 occurred likely has a problem of some type.
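-
- As a rough illustration of the lookup described above, the Python sketch
- below matches the segment of an exception address against the paragraph
- ranges from the sample Manifest listing. The helper name find_owner and the
- table layout are hypothetical, and comparing only the segment half of the
- address is a simplifying assumption that happens to suit this example.

```python
# Paragraph ranges copied from the sample Manifest "First Meg / Programs" listing.
MEMORY_MAP = [
    (0x03D1, 0x0465, "COMMAND"),
    (0x0466, 0x046A, "(04C0)"),
    (0x046B, 0x0483, "COMMAND Environment"),
    (0x0484, 0x0487, "COMMAND Data"),
    (0x0488, 0x0498, "DV Environment"),
    (0x0499, 0x04BE, "DV"),
    (0x04BF, 0x1A38, "DV Data"),
    (0x1A39, 0x1A52, "COMMAND Data"),
    (0x1A53, 0x1AE7, "COMMAND"),
    (0x1AE8, 0x1B00, "COMMAND Environment"),
    (0x1B01, 0x7E4F, "[Available]"),
]


def find_owner(address, memory_map=MEMORY_MAP):
    """Return the description of the block holding a 'SSSS:OOOO' address.

    Hypothetical helper: only the segment is compared (an assumption made
    for this example). Returns None when no listed block contains it.
    """
    segment_text, _offset_text = address.split(":")
    segment = int(segment_text, 16)
    for start, end, description in memory_map:
        if start <= segment <= end:
            return description
    return None


print(find_owner("1B12:0103"))  # prints [Available]
```

- The sample address 1B12:0103 lands in the [Available] block, which is why the
- application that would have been loaded there becomes the suspect.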
-
- 2) The second part of the Exception #13 message is the register dump:
-
- AX=0000 BX=0000 CX=0000 DX=0000 SI=FFFF DI=0000 BP=0000
- DS=1B12 ES=1B12 SS=1B12 SP=FFFE Flags=7246
-
- The registers are the temporary storage areas on the 80386 chip which are used
- for calculations and addressing. Each register is two bytes (16 bits) in
- size, so each register is capable of holding a value from 0 to FFFF
- (hexadecimal), or from 0 to 65535 (decimal).
-
- If any registers here are 0000 or FFFF, it's possible that you could be
- looking at a segment wrap. A segment wrap happens whenever a program attempts
- to access -- read from or write to -- something beyond the limit of a segment.
- A word value consists of two adjacent bytes; if a word value were to begin at
- FFFF (which is the last byte of a segment), the second byte of that value would
- be outside the segment -- and an attempt to read from or write to that word
- would thus cause a protection violation. Similarly, a doubleword is four
- adjacent bytes; if any of the last three bytes is outside of the segment
- limit, a segment wrap and a protection violation will occur when an access is
- attempted.
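-
- The arithmetic behind a segment wrap can be sketched in a few lines of
- Python. This is only an illustration: the name crosses_segment_limit is
- hypothetical, and the limit of FFFF assumes the virtual 8086 mode segments
- discussed in this note.

```python
SEGMENT_LIMIT = 0xFFFF  # last valid offset in a virtual 8086 mode segment


def crosses_segment_limit(offset, size):
    """Return True if `size` bytes starting at `offset` run past offset FFFF."""
    last_byte = offset + size - 1
    return last_byte > SEGMENT_LIMIT


print(crosses_segment_limit(0xFFFF, 1))  # byte at FFFF: False
print(crosses_segment_limit(0xFFFF, 2))  # word at FFFF: True
print(crosses_segment_limit(0xFFFD, 4))  # doubleword at FFFD: True
print(crosses_segment_limit(0xFFFC, 4))  # doubleword at FFFC: False
```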
-
- On an 8086 processor, it's actually possible for a segment wrap to occur
- without a protection violation, simply because the 8086 has no hardware
- protection at all. What is the byte after the last byte of a segment? On the
- 8086, it's the FIRST byte of the same segment. (Non-technical analogy for
- poker players: Queen - King - Ace - Two - Three is a straight in the
- penny-ante poker game played when the 8086 processor is dealing. The 386
- processor is a very strict dealer, and does not permit this.) It is possible
- (though unlikely) for a program to continue without a crash on an 8086
- processor when two "adjacent" bytes are actually a whole segment apart; it
- could theoretically be possible on a 386 too, but the exception is generated
- before the memory access can be completed.
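-
- To make the 8086 behaviour concrete, here is a small Python sketch of where
- the two bytes of a word at offset FFFF end up when offsets wrap at 64K. The
- function names are hypothetical, and the segment value 1B12 is borrowed from
- the sample report purely for illustration.

```python
def word_byte_offsets_8086(offset):
    """Offsets of the two bytes of a word on an 8086, where offsets wrap at 64K."""
    return [offset & 0xFFFF, (offset + 1) & 0xFFFF]


def linear_address(segment, offset):
    """Real-mode style address: segment * 16 + offset."""
    return (segment << 4) + offset


offsets = word_byte_offsets_8086(0xFFFF)
print([f"{off:04X}" for off in offsets])
print([f"{linear_address(0x1B12, off):05X}" for off in offsets])
```

- The second byte comes from offset 0000 of the same segment, so the two halves
- of the "word" sit almost 64K apart in memory.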
-
- This sort of problem is seen most commonly during a string move -- the program
- is copying a whole block of data from one range of addresses to another. You
- may not understand this, and actually it doesn't matter if you don't.
- Briefly, though, SI stands for Source Index; DI stands for Destination Index.
- These two registers are used for string instructions -- instructions that load
- or copy information sequentially. String instructions are extremely powerful
- and useful, since they allow the developer to deal with large amounts of data
- in a single pass. A byte or a word value can be fetched from memory by one
- string instruction, dealt with, and then the result can be copied to a new
- memory location with a second string instruction -- and all this can be
- managed with an extremely tight, fast loop. An entire range of addresses (for
- example, in screen memory) can even be filled with a given value using a
- single instruction. The catch here is that the string instruction is only
- valid as long as the value of the SI or DI register does not fall outside the
- range addressable by these registers. If either one of these tries to exceed
- FFFF (or tries to fall below 0000), as a string is being copied from one
- region of memory to another, you'll get a protection violation.
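-
- The following Python sketch models a forward word-by-word string copy and
- stops where the 386 would raise the exception. It is a simplified model, not
- a description of the processor's internals: the names ProtectionFault and
- copy_words are hypothetical, the copy is assumed to run forward two bytes at
- a time, and a fault is signalled only when a word would begin at offset FFFF.

```python
class ProtectionFault(Exception):
    """Stands in for the Exception #13 that the 386 would report."""


def copy_words(si, di, count):
    """Model a forward word copy; return the (SI, DI) pairs that were moved.

    Raises ProtectionFault when a word would begin at offset FFFF, because
    its second byte would then lie beyond the segment limit.
    """
    moved = []
    for _ in range(count):
        for name, index in (("SI", si), ("DI", di)):
            if index == 0xFFFF:
                raise ProtectionFault(
                    f"{name}={index:04X}: word would cross the segment limit"
                )
        moved.append((si, di))
        # 16-bit index registers advance by two bytes per word moved.
        si = (si + 2) & 0xFFFF
        di = (di + 2) & 0xFFFF
    return moved


try:
    copy_words(si=0xFFFB, di=0x0000, count=4)
except ProtectionFault as fault:
    print("Exception #13:", fault)
```

- Starting from an odd offset such as FFFB, the third word would begin at FFFF,
- which is where this model gives up.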
-
- 3) Instruction: A5 CC 00 00 00 00 00 00 00 00 00 00 00 00 00
- Do you want to (T)erminate the program or (R)eboot?
-
- This is the invalid instruction that the program was trying to execute when
- the processor stopped it. Since most humans don't have a hope of interpreting
- machine language by looking at the opcodes, you can get a better
- interpretation of what is going on by examining this instruction with a
- program that can render machine codes into assembly language. (Well... it's
- better than nothing.) To do so, go into DEBUG; type DEBUG at the DOS prompt.
-
- Enter the values from the Instruction line by typing
-
- E 100
-
- at DEBUG's hyphen prompt, and then entering each byte (pair of digits) from
- the instruction line. Follow each byte with a space.
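-
- If retyping the bytes by hand is tedious, a short Python helper can turn the
- Instruction line into a single DEBUG command. This is a sketch only: the
- function name debug_enter_command is hypothetical, and it relies on DEBUG
- accepting a list of byte values after the address in its E command.

```python
def debug_enter_command(instruction_line, address=0x100):
    """Build a DEBUG 'E' command from the bytes on an Instruction line."""
    # Keep whatever follows the "Instruction:" label, if the label is present.
    byte_text = instruction_line.split(":", 1)[-1]
    byte_values = byte_text.split()
    for value in byte_values:
        if len(value) != 2:
            raise ValueError(f"expected a pair of hex digits, got {value!r}")
        int(value, 16)  # raises ValueError if the pair is not valid hex
    return f"E {address:X} " + " ".join(byte_values)


print(debug_enter_command("Instruction: A5 CC 00 00"))  # prints E 100 A5 CC 00 00
```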
-
- (As a bonus -- if you're running under DESQview, you can Mark the information
- from the Exception #13 report, and Transfer it into DEBUG running in a
- different Big DOS window.)
-
- If most of the bytes begin with a 4, 5, 6, or 7, there's a good chance that
- you're seeing a program trying to execute text, thinking that text to be code.
- This can happen in several circumstances, but frequent offenders are those
- programs which load code at the top of conventional memory during boot -- and
- therefore during the OPTIMIZE process -- and presume that no program will
- allocate that memory. Programs which place parts of themselves at the top of
- conventional memory typically do so without protecting themselves from
- programs like LOADHI which may need to allocate all conventional memory at
- appropriate times; LOADHI (and programs like it) will overwrite the vulnerable
- code.
-
- As a real-world example, PROTMAN, a program whose purpose in life is to manage
- the loading of various parts of 3Com and MS-LAN networks, did this in past
- versions, as explained in Quarterdeck Technical Note #173, PROTMAN.TEC.
- During the OPTIMIZE process, LOADHI would allocate all conventional memory
- while it was determining the size of the various drivers that were being
- loaded. PROTMAN would jump to what it thought was still its own code, but
- there would be LOADHI signatures there -- text -- and PROTMAN would crash.
-
- You can see the contents of this string if you Dump the instruction you just
- entered; use DEBUG's D instruction to do this.
-
- -d 100
-
- 1DC0:0100 4F 41 44 48 49 53 49 47-4E 41 54 55 52 BF 42 87 LOADHISIGNATUREB
- 1DC0:0110 98 FF 6F E2 E9 FF 00 00-26 21 F1 B3 34 00 AF 1D ..o.....&!..4...
- 1DC0:0120 01 00 D3 E0 0B E8 59 5F-07 B0 00 AA 5F 9D F8 C3 ......Y_...._...
- 1DC0:0130 AA 41 FE 06 AD 90 C3 2E-C7 06 CF 88 00 00 2E 89 .A..............
-
- ASCII codes starting with 2 are generally punctuation marks; bytes 30-39
- represent numeric digits; 3A-3F are punctuation, 41-5A are capital letters,
- 61-7A are small letters. Any instruction made up mostly of these numbers is
- almost certainly text -- and therefore not executable program code. The
- program that is trying to run such an instruction is doing so in error. When
- the instructions are NOT mostly in the 40-80 range, you should try to
- Unassemble them.
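-
- The rule of thumb above can be approximated in Python. The function names
- and the 75% threshold are assumptions chosen for illustration, not values
- taken from QEMM-386, and the test simply counts bytes in the printable ASCII
- range 20-7E.

```python
def printable_fraction(byte_values):
    """Fraction of the bytes that fall in the printable ASCII range 20h-7Eh."""
    if not byte_values:
        return 0.0
    printable = sum(1 for b in byte_values if 0x20 <= b <= 0x7E)
    return printable / len(byte_values)


def looks_like_text(byte_values, threshold=0.75):
    """Guess whether bytes are text rather than code (threshold is arbitrary)."""
    return printable_fraction(byte_values) >= threshold


dump_row = bytes.fromhex("4F4144484953494 74E41545552BF4287".replace(" ", ""))
instruction = bytes.fromhex("A5CC" + "00" * 13)

print(looks_like_text(dump_row))     # first row of the sample dump: True
print(looks_like_text(instruction))  # the sample Instruction line: False
```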
-
- -u 100
-
- 20C0:0100 A5 MOVSW
- 20C0:0101 CC INT 3
- 20C0:0102 0000 ADD [BX+SI],AL
-
- This is the killer instruction from the example Exception #13 above. It's
- performing a MOVSW (MOVe String Word) at a point when the SI register is FFFF,
- and that means that it's trying to read a word value that begins at the last
- byte of a segment, which (as described above) is illegal.
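-
- Putting the register dump and the first opcode byte together, the reasoning
- for this example can be sketched in Python. Everything here is illustrative:
- the names parse_registers and diagnose are hypothetical, and the opcode table
- deliberately covers only the MOVSW case from the sample report.

```python
import re

# Deliberately tiny table: only the opcode from the sample report.
STRING_WORD_OPCODES = {0xA5: "MOVSW"}


def parse_registers(dump):
    """Turn 'AX=0000 BX=0000 ...' text into a {name: value} dictionary."""
    pairs = re.findall(r"(\w+)=([0-9A-Fa-f]{4})", dump)
    return {name: int(value, 16) for name, value in pairs}


def diagnose(dump, instruction_bytes):
    """Flag a likely segment wrap for a word string move with SI or DI at FFFF."""
    regs = parse_registers(dump)
    mnemonic = STRING_WORD_OPCODES.get(instruction_bytes[0])
    if mnemonic is None:
        return "opcode not covered by this sketch"
    at_limit = [name for name in ("SI", "DI") if regs.get(name) == 0xFFFF]
    if at_limit:
        return f"{mnemonic} with {' and '.join(at_limit)}=FFFF: likely segment wrap"
    return f"{mnemonic} with SI and DI inside the segment"


SAMPLE_DUMP = (
    "AX=0000 BX=0000 CX=0000 DX=0000 SI=FFFF DI=0000 BP=0000 "
    "DS=1B12 ES=1B12 SS=1B12 SP=FFFE Flags=7246"
)
print(diagnose(SAMPLE_DUMP, bytes([0xA5, 0xCC])))
```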
-
- Other invalid instructions are harder for the non-programmers of the world to
- interpret. Often the first byte of an invalid instruction is 0F -- the first
- byte of many valid protected-mode instructions, which the processor
- nonetheless rejects if the machine is in Virtual 86 mode. Exceptions of this kind
- showed up more commonly in the past, with programs that were trying to enter
- protected mode without calling the Virtual Control Program Interface. VCPI is
- an industry-standard way for protected-mode software to coexist with 386
- expanded memory managers such as QEMM-386; all 386 memory managers these days
- are VCPI-providers, and almost all protected-mode programs are VCPI users (or
- "clients"). Non-VCPI protected-mode programs include some memory- and
- hardware-diagnostic programs, and programs that use the DPMI memory management
- specification exclusively. Diagnostic programs typically recommend that you
- disable all memory-management software during diagnosis. DPMI programs will
- typically accept VCPI memory management; those rare programs that do not will
- simply refuse to start up under QEMM-386. In such cases, you may install
- QDPMI (the Quarterdeck DPMI Host) on your system; QDPMI is available on the
- Quarterdeck BBS at (310) 314-3227, CompuServe (!GO QUARTERDECK), or large
- local BBS systems.
-
- How can an Exception #13 be fixed? Two Quarterdeck Technical Notes can help
- you determine if you can solve the problem yourself. Quarterdeck Technical
- Note #241, QEMM-386: General Troubleshooting (TROUBLE.TEC) is a good place to
- start. This note describes common problems and possible solutions, and will
- help if the cause of the Exception #13 is a memory conflict or bus-mastering
- issue. Quarterdeck Technical Note #232, Exception #13 Advanced
- Troubleshooting (EX13FLOW.TEC) should help you to determine if there is
- anything at all that you can do yourself to fix the problem.
-
- If you follow the instructions in both of these technical notes completely,
- and the Exception #13 persists, the prospects for a resolution are bleak,
- since the problem is almost certainly a bug in the offending program. If this
- is so, unless you can alert the developer of the program (and make him or her
- understand all this, which might be another task altogether), you can never
- really make the problem go away, although sometimes you may be able to make it
- subside.
-
- Changing the location of the offending program in memory will sometimes help.
- If you're running under DESQview, and you're sure that you've given the
- program enough memory (i.e., all you can give it), try adding 16 to the size
- of the script buffer on page 2 of Change a Program. If you're not running
- under DESQview, try adding an extra file handle or two. The key here is to
- change the location of the program in memory, which can occasionally be enough
- to provide temporary relief from the Exception #13.
-
- There is a substantial caveat: You're not fixing the problem by doing this;
- you're just making it submerge. There's still probably a bug in the offending
- program -- you've just changed it from a bomb to a landmine. If you can
- reproduce the problem consistently, you should still contact the publisher of
- the application with all of the data from the Exception #13 message, and all
- of the data that you can supply about your system and its current
- configuration.
-
- With the exception (no pun intended) of the techniques mentioned above and in
- EX13FLOW.TEC, non-programmers can do little to fix the root cause or even the
- symptoms of Exception #13. If you are unsuccessful in resolving a conflict,
- the information provided by the report should be forwarded, along with a
- Manifest printout and a complete description of your system, to the developer
- of the program that you were running at the time.
-
- ************************************************************************
- * Trademarks are property of their respective owners. *
- *This technical note may be copied and distributed freely as long as it*
- *is distributed in its entirety and it is not distributed for profit. *
- * Copyright (C) 1993 by Quarterdeck Office Systems *
- ************************ E N D O F F I L E *************************
-